We provide and analyze examples that counter the widely made claim that tunneling is needed
for a quantum speedup in optimization problems. The examples belong to the class of perturbed 
Hamming-weight optimization problems. In one case, featuring a plateau in the cost function in 
Hamming weight space, we find that the adiabatic dynamics that make tunneling possible, while 
superior to simulated annealing, result in a slowdown compared to a diabatic cascade of avoided 
level-crossings. This, in turn, inspires a classical spin vector dynamics algorithm that is at least as 
efficient for the plateau problem as the diabatic quantum algorithm. In a second case whose cost 
function is convex in Hamming weight space, the diabatic cascade results in a speedup relative to 
both tunneling and classical spin vector dynamics. 


The possibility of a quantum speedup for finding the solution of classical optimization
problems is tantalizing, as a quantum advantage for this class of problems would provide a
wealth of new applications for quantum computing. The goal of many optimization problems can
be formulated as finding an $n$-bit string $x_{\rm opt}$ that minimizes a given function
$f(x)$, which can be interpreted as the energy of a classical Ising spin system whose ground
state is $x_{\rm opt}$. Finding the ground state of such systems can be hard if, e.g., the
system is strongly frustrated, resulting in a complex energy landscape that cannot be
efficiently explored with any known algorithm due to the presence of many local minima [1].
This can occur, e.g., in classical simulated annealing (SA) [2], when the system's state is
trapped in a local minimum. Provided the barriers between minima are sufficiently thin,
quantum mechanics allows the system to tunnel in order to escape from such traps, though such
comparisons must be treated with care since the quantum and classical potential landscapes
are in general different. It is with this potential advantage over classical annealing that
quantum annealing [3-5] and the quantum adiabatic algorithm (QA) were proposed [6]. Thermal
hopping and quantum tunneling provide two starkly different mechanisms for solving
optimization problems, and finding optimization problems that favor the latter continues to
be an open theoretical question [7, 8]. To attack this question, in this work we focus on a
well-known class of problems known as perturbed Hamming weight oracle (PHWO) problems. These
are problems for which instances can be generated where QA either has an advantage over
classical random search algorithms with local updates, such as SA [9, 10], or has no
advantage [10, 11]. Moreover, for PHWO problems with qubit permutation symmetry there is an
elegant interpretation of tunneling in terms of a semiclassical potential [9, 12], which we
exploit in this work.

If the total evolution time is sufficiently long so that the adiabatic condition is
satisfied, QA is guaranteed to reach the ground state with high probability [13, 14].
However, this condition is only sufficient, and the scaling of the time to reach the
adiabatic regime is therefore not necessarily the right computational complexity metric. The
optimal time to solution ($\mathrm{TTS}_{\rm opt}$), commonly used in benchmarking studies
[15] [also see the Supplementary Material (SM)], is a more natural metric. It is defined as
the minimum total time such that the ground state is observed at least once with desired
probability $p_d$:


$$\mathrm{TTS}_{\rm opt} = \min_{t_f > 0} \left( t_f \, \frac{\ln(1 - p_d)}{\ln[1 - p_{\rm GS}(t_f)]} \right), \tag{1}$$


where $t_f$ is the duration (in QA) or the number of single spin updates (in SA) of a single
run of the algorithm, and $p_{\rm GS}(t_f)$ is the probability of finding the ground state in
a single such run. The use of $\mathrm{TTS}_{\rm opt}$ allows for the possibility that
multiple short (diabatic) runs of the evolution, each lasting an optimal annealing time
$(t_f)_{\rm opt}$, result in a better scaling than a single long (adiabatic) run with an
unoptimized $t_f$.
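
As an illustration of how Eq. (1) is evaluated in practice, the following minimal Python sketch computes the time to solution for a list of measured pairs $(t_f, p_{\rm GS})$ and minimizes over them. The function names `tts` and `tts_opt` and the input format are hypothetical. The sketch also assumes that the number of repetitions is never counted as less than one run, so that the result saturates at $t_f$ once $p_{\rm GS} \geq p_d$.

```python
import math

def tts(t_f, p_gs, p_d=0.99):
    """Time to solution for runs of length t_f with single-run success p_gs."""
    if p_gs <= 0.0:
        return math.inf          # the ground state is never observed
    if p_gs >= p_d:
        return t_f               # assumption: a single run already suffices
    # number of repetitions from Eq. (1), times the single-run time
    return t_f * math.log(1.0 - p_d) / math.log(1.0 - p_gs)

def tts_opt(samples, p_d=0.99):
    """Minimize TTS over (t_f, p_gs) pairs; returns (best TTS, best t_f)."""
    return min((tts(t, p, p_d), t) for t, p in samples)
```

For example, `tts_opt([(1, 0.01), (10, 0.5), (100, 0.9)], p_d=0.75)` selects the intermediate run length: two runs of length 10 beat both many very short runs and one long run.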

Here we demonstrate that for a specific class of PHWO problems, the optimal evolution time
$t_f$ for QA is far from being adiabatic, and this optimal evolution involves no multi-qubit
tunneling. Instead the system leaves the ground state, only to return through a sequence of
diabatic transitions associated with avoided level crossings. We also show that spin vector
dynamics, which can be interpreted as a semi-classical limit of the quantum evolution with a
product-state approximation, evolves in an almost identical manner.

We note that PHWO problems are strictly toy problems since these problems are typically
represented by highly non-local Hamiltonians (see the SM) and thus are not physically
implementable, in the very same sense that the adiabatic Grover search problem is unphysical
[16, 17]. Nevertheless, these problems provide us with important insights into the mechanisms
behind a quantum speedup, or lack thereof.

The plateau problem. We focus on PHWO problems with a Hamiltonian of the form:


$$f(x) = \begin{cases} |x| + p(|x|), & l < |x| < u, \\ |x|, & \text{elsewhere}, \end{cases} \tag{2}$$


where $|x|$ denotes the Hamming weight of the bit string $x \in \{0,1\}^n$. The minimum of
this function is $x_{\rm opt} = 00\cdots0$. In the absence of a perturbation, i.e.,
$p(|x|) = 0$, the problem can be solved efficiently classically using a local spin-flip
algorithm such as SA, since flipping any '1' to a '0' will lower the energy. Note that SA can
be interpreted as a random walk, where the decision to walk left or right and flip a spin is
given by the Metropolis update rule.

QA evolves the system from its ground state at $t = 0$ according to a time-dependent
Hamiltonian

$$H(s) = \frac{1}{2}(1-s) \sum_i \left(\mathbb{1} - \sigma_i^x\right) + s \sum_x f(x) |x\rangle\langle x| , \tag{3}$$

where we have chosen the standard transverse field "driver" Hamiltonian $H(0)$ that assumes
no prior knowledge of the form of $f(x)$, and a linear interpolating schedule, with
$s = t/t_f$ being the dimensionless time parameter.

Reichardt proved the following lower bound on the spectral gap for adiabatic evolutions for
general PHWO problems [10]:

$$\mathrm{Gap}[H^{\rm pert}(s)] \geq \mathrm{Gap}[H^{\rm unpert}(s)] - O\!\left(h \, \frac{u-l}{\sqrt{l}}\right), \tag{4}$$

where $h = \max_x p(|x|)$ is the maximum height of the perturbation. Note that while the lower
bound holds for all perturbations, it is only non-trivial when it is positive for all
$s \in [0,1]$. Details of our simulation methods and a summary of the proof are given in the
SM.

We focus on the following "plateau" problem:

$$f(x) = \begin{cases} u - 1, & l < |x| < u, \\ |x|, & \text{otherwise}. \end{cases} \tag{5}$$

Thus, here $h = u - l - 1$. We note that when both $l, u = O(1)$ a lower bound is not
obtainable from Reichardt's proof (this is explained in the SM), although numerical
diagonalization reveals a constant gap. We demonstrate below that this case is nevertheless
particularly hard for SA, and hence it is the focus of our study.

Adiabatic dynamics. We now demonstrate that, under the assumption of adiabatic dynamics, QA
efficiently tunnels through a barrier and solves the plateau problem in at most linear time,
while SA requires a time that grows polynomially in the problem size $n$.

In the adiabatic (long-time) limit, SA follows the thermal Gibbs state parametrized by the
inverse temperature $\beta$, whereas QA follows the instantaneous ground state parametrized
by $s$. We quantify the distance of these two states from the target (the $|0\rangle^{\otimes n}$
state) by calculating the expectation value of the Hamming weight operator, defined as
$\mathrm{HW} = \frac{1}{2} \sum_i (\mathbb{1} - \sigma_i^z)$. In particular, when
$\langle \mathrm{HW} \rangle \approx 0$, the system has a high probability of being in the
state $|0\rangle^{\otimes n}$.
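
For a cost function that depends only on the Hamming weight $w$, the Gibbs expectation of HW reduces to a sum over the $n+1$ weights, each weighted by its degeneracy $\binom{n}{w}$. The following Python sketch illustrates this; the function names `plateau` and `gibbs_mean_hw` are hypothetical, and the log-space shift is only a numerical-stability choice, not part of the original analysis.

```python
import math

def plateau(w, l, u):
    """Plateau cost of Eq. (5) as a function of Hamming weight w."""
    return u - 1 if l < w < u else w

def gibbs_mean_hw(n, beta, f):
    """<HW> in the Gibbs state of a cost f(w) depending only on Hamming weight."""
    # log of the unnormalized weight: log C(n, w) - beta * f(w)
    logw = [math.lgamma(n + 1) - math.lgamma(w + 1) - math.lgamma(n - w + 1)
            - beta * f(w) for w in range(n + 1)]
    shift = max(logw)                      # avoid overflow for large n
    weights = [math.exp(v - shift) for v in logw]
    z = sum(weights)                       # partition function (up to the shift)
    return sum(w * p for w, p in enumerate(weights)) / z
```

Scanning `gibbs_mean_hw(n, beta, lambda w: plateau(w, l, u))` over `beta` gives the Gibbs curve of the type described for Fig. 1(a); for the unperturbed cost $f(w) = w$ it reproduces the closed form $n/(1 + e^{\beta})$.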

As we tune the annealing parameters $\beta$ and $s$, the plateau in Eq. (5) induces a
dramatic change in $\langle \mathrm{HW} \rangle$, of the order of the plateau width $u$ and
over a narrow range of the annealing parameters, as shown in Fig. 1(a). For SA, traversing the
plateau to reach the Gibbs state is a hard problem because, as the random walker moves closer
to $x_{\rm opt}$, the probability to hop in the wrong direction of increasing Hamming weight
becomes greater than the



[Figure 1: panels (a) and (b); horizontal axis of (a): $t/t_f$.]

FIG. 1. (a) $\langle \mathrm{HW} \rangle$ in the instantaneous quantum ground state (GS), the
classical Gibbs state $\rho = e^{-\beta H(1)}/Z$ (Gibbs), and the instantaneous quantum ground
state predicted from the semi-classical potential (SC GS), as a function of their
corresponding annealing parameters. The sharp drop in the GS and SC GS curves is due to a
tunneling event wherein $\sim u$ qubits are flipped. Note that we use $t/t_f$ also for the
Gibbs state, though in actuality the parameter is $\beta$, with a linear schedule:
$\beta = 0.1 + 5.9 s$. (b) The semi-classical potential for $n = 512$ and $u = 6$ exhibits a
double-well degeneracy at the position $s \approx 0.89$ (solid) of the sharp drop observed in
(a), but is non-degenerate before and after this point (dotted and dashed), thus leading to a
discontinuity in the position of its global minimum. The same is observed for other $u$
values we have checked (not shown). Inset: the difference in the position of the minimum gap
from exact diagonalization and the position of the double-well degeneracy from the
semi-classical potential, as a function of $n$, at $u = 6$ (log-log scale).

[Figure 2: panels (a), (b), and (c); axis label $t/t_f$.]

FIG. 2. Diabatic QA vs SA and SVD. (a) Population $P_i$ in the $i$th energy eigenstate along
the diabatic QA evolution at the optimal TTS for $n = 512$ and $u = 6$. Excited states are
quickly populated at the expense of the ground state. By $t/t_f = 0.5$ the entire population
is outside the lowest 9 eigenstates. In the second half of the evolution the energy
eigenstates are repopulated in order. Inset: the eigenenergy spectrum along the evolution.
Note the sequence of avoided level crossings that unmistakably line up in the spectrum to
reach the ground state. (b) Scaling of the optimal TTS with $n$ for $u = 6$, with an optimized
number of single-spin updates for SA, and equal $(t_f)_{\rm opt}$ for QA and SVD. SA scales as
$O(n)$, a consequence of performing sequential single-spin updates. QA and SVD both approach
$O(1)$ scaling as $n$ increases. Here we set $p_d = 0.7$ in Eq. (1), in order to be able to
observe the saturation of SVD's TTS to the point where a single run suffices, i.e.,
$\mathrm{TTS}_{\rm opt} = (t_f)_{\rm opt}$. The conclusion is unchanged if we increase $p_d$:
this moves the saturation point to larger $n$ for both SVD and QA, and we have checked that
SVD always saturates before QA. Inset: scaling as a function of $u$ for $n = 1008$. SVD is
again seen to exhibit the best scaling, while for this value of $n$ the scaling of QA and SA
is similar (QA's scaling with $u$ improves faster than SA's as a function of $n$, at constant
$u$). (c) $\langle \mathrm{HW} \rangle$ of the QA wavefunction and the SVD state (defined as
the product of identical spin-coherent states) for $n = 512$ and $u = 6$. The behavior of the
two is identical up to $t/t_f \approx 0.8$, when they begin to differ significantly, but
neither displays any of the sharp changes observed in Fig. 1(a) for the instantaneous ground
state. Inset: the trace-norm distance between the QA and SVD states, showing that they remain
almost indistinguishable until $t/t_f \approx 0.8$.


reverse. Consequently, as we prove in the SM, SA takes a number of single-spin updates that
grows polynomially in $n$ to find the ground state.

Unlike SA, to solve the plateau problem QA must tunnel through an energy barrier. To
demonstrate this, we first construct the semiclassical effective potential arising from the
spin-coherent path-integral formalism [18]:

$$V_{\rm SC} = \langle \theta, \phi | H(s) | \theta, \phi \rangle , \tag{6}$$

where $|\theta, \phi\rangle$ are the spin-1/2 (symmetric) coherent states [19]. The potential
captures all the important features of the quantum Hamiltonian (3): it displays a degenerate
double-well potential almost exactly at the point of the minimum gap [see Fig. 1(b)]; the
discontinuous change in the position of the global minimum of the potential gives rise to a
nearly identical change in $\langle \mathrm{HW} \rangle$ for the spin-coherent state [see
Fig. 1(a)]. This agreement improves with increasing $n$, which is expected from standard
large-spin arguments [20]. The dramatic drop in $\langle \mathrm{HW} \rangle$ seen in
Fig. 1(a) implies that $\sim u$ qubits have to be flipped in order to follow the instantaneous
ground state. Since the double-well potential becomes degenerate at the point where this
flipping happens, as seen in Fig. 1(b), QA tunnels through the barrier in order to
adiabatically follow the ground state. The constant minimum gap implies that this tunneling
event happens in a time that is dictated by the scaling of the numerator of the adiabatic
condition. In our case this numerator turns out to be well approximated by the matrix element
of $\partial_s H(s)$ between the ground and first excited states, leading to $\sim O(n^{0.5})$
in the adiabatic limit (see the SM for details), thus polynomially outperforming SA whenever
$u - l > 2$.

Optimal QA via Diabatic Transitions. Even though QA encounters a constant gap and can tunnel
efficiently, the possibility remains that this does not lead to an optimal TTS, since this
result assumes the adiabatic limit. We thus consider a diabatic form of QA and next
demonstrate, using the optimal TTS criterion defined in Eq. (1), that the optimal annealing
time for QA is far from adiabatic. Instead, as shown in Fig. 2(a), the optimal TTS for QA is
such that the system leaves the instantaneous ground state for most of the evolution and only
returns to the ground state towards the end. The cascade down to the ground state is mediated
by a sequence of avoided energy level crossings. As $n$ increases for fixed $u$, repopulation
of the ground state improves for fixed $(t_f)_{\rm opt}$, hence causing
$\mathrm{TTS}_{\rm opt}$ to decrease with $n$, as seen in Fig. 2(b), until it saturates to a
constant at the lowest possible value, corresponding to a single run at $(t_f)_{\rm opt}$. At
this point the problem is solved in constant time $(t_f)_{\rm opt}$, compared to the
$\sim O(n^{0.5})$ scaling of the adiabatic regime. Moreover, as shown in Fig. 2(c), there are
no sharp changes in $\langle \mathrm{HW} \rangle$, suggesting that the non-adiabatic dynamics
do not entail multi-qubit tunneling events, unlike the adiabatic case.

Given the absence of tunneling in the time-optimal quantum evolution, we are motivated to
consider a semi-classical limit of the evolution, particularly that of classical spin-vector
dynamics (SVD), which we describe in detail in the SM. SVD can be derived as the saddle-point
approximation to the path integral formulation of QA in the spin-coherent basis [21]. The
equations share the same qubit permutation symmetry as the quantum Hamiltonian, which
significantly simplifies the computation of the oracle [i.e., the potential (and its
derivatives), now given by Eq. (6)] assumed for SVD. The SVD equations are equivalent to the
Ehrenfest equations for the magnetization under the assumption that the density matrix is a
product state, i.e., $\rho = \bigotimes_{i=1}^{n} \rho_i$, where $\rho_i$ denotes the state of
the $i$th qubit.

As we show in Fig. 2(b), the scaling of SVD's optimal TTS also saturates to a constant time,
i.e., $(t_f)_{\rm opt}$. Moreover, it reaches this value earlier (as a function of problem
size $n$) than QA, thus outperforming QA for small problem sizes, while for large enough $n$
both achieve $O(1)$ scaling. As seen in the inset, SVD's advantage persists as a function of
$u$ at constant $n$.

The dynamics of QA in the non-adiabatic limit are well approximated by SVD until close to the
end of the evolution, as shown in Fig. 2(c); the trace-norm distance between the
instantaneous states of QA and SVD is almost zero until $t/t_f \approx 0.8$, after which the
states start to diverge. This suggests that SVD is able to replicate the QA dynamics up to
this point, and only deviates because this makes it more successful at repopulating the
ground state than QA.

Discussion. For the class of PHWO problems studied here, we have demonstrated that tunneling
is not necessary to achieve the optimal TTS. Instead, the optimal trajectory uses diabatic
transitions to first scatter completely out of the ground state and return via a sequence of
avoided level crossings. This use of diabatic transitions is similar in spirit to those
studied in Refs. [22-25], but there are some important differences: essentially, our PHWO
findings hold for the standard formulation of QA, without any fine-tuning of the
interpolation schedule or the Hamiltonian. However, while both the adiabatic and diabatic
quantum algorithms outperform SA for the plateau problem, the faster quantum diabatic
algorithm is not better than the classical SVD algorithm for this problem. These results
extend beyond the plateau problem: as we show in the SM, even the "spike" problem studied in
Ref. [9] (which is in some sense the antithesis of the plateau problem, since it features a
sharp spike at a single Hamming weight) also exhibits the diabatic-beats-adiabatic
phenomenon, indicating that tunneling is not required to efficiently solve the problem.
Moreover, SVD is faster for this problem as well.

However, the mechanism we found here, of a "lining-up" of the avoided level crossings with an
associated "diabatic cascade" [seen in Fig. 2(a)], may not be generic. E.g., we have checked
that this mechanism is absent in the adiabatic Grover problem with a transverse field driver
Hamiltonian [as in our Eq. (3)], even though the



FIG. 3. The optimal TTS for the potential given in Eq. (7). QA outperforms SVD over the range
of problem sizes we were able to check. The reason can be seen in the inset, which displays
the ground state probability for SVD and QA for different annealing times $t_f$, with
$n = 512$. The optimal annealing time for SVD occurs at the first peak in its ground state
probability ($t_f \approx 8.98$), whereas the optimal annealing time for QA occurs at the much
larger second peak in its ground state probability ($t_f \approx 10.91$).


Grover problem is then equivalent to a "giant" plateau problem:
$f(x) = 1 - \delta_{|x|,0}$ [26].

It is important to note that we do not claim that PHWO problems are always associated with
diabatic cascades (see the SM for a counterexample); nor do we claim that SVD will always have
an advantage over QA. A simple counterexample to the latter statement comes from the class of
cost functions that are convex in Hamming weight space, which have a constant minimum gap
[27]:

$$f(x) = \begin{cases} 2, & |x| = 0, \\ |x|, & \text{otherwise}. \end{cases} \tag{7}$$


We have observed similar diabatic transitions for this problem as for the plateau (not
shown), thus obviating tunneling, but find that QA outperforms SVD, as shown in Fig. 3. This
occurs because the optimum TTS for QA occurs at a slightly higher optimal annealing time,
i.e., there is an advantage to evolving somewhat more slowly, though still far from
adiabatically. Thus, this is a case of a "limited" quantum speedup [15].

In summary, our work provides a counterargument to the widely made claim that tunneling is
needed for a quantum speedup in optimization problems. Which features of Hamiltonians of
optimization problems favor diabatic or adiabatic algorithms remains an open question.

I. DERIVATION OF Eq. (1)

Equation (1) is easily derived as follows: the probability of successively failing $k$ times
is $[1 - p_{\rm GS}(t_f)]^k$, so the probability of succeeding at least once after $k$ runs is
$1 - [1 - p_{\rm GS}(t_f)]^k$, which we set equal to the desired success probability $p_d$;
from here one extracts the number of runs $k = \ln(1 - p_d)/\ln[1 - p_{\rm GS}(t_f)]$ and
multiplies by $t_f$ to get the time-to-solution TTS. Optimizing over $t_f$ yields
$\mathrm{TTS}_{\rm opt}$, which is natural for benchmarking purposes in the sense that it
captures the trade-off between repeating the algorithm many times vs optimizing the
probability of success in a single run. The adiabatic regime might be more attractive if one
seeks a theoretical guarantee to have a certain probability of success if the evolution is
sufficiently slow.


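II. (NON-)LOCALITY OF PHWO PROBLEMS

Since the PHWO problems, including the plateau, are quantum oracle problems, they generically
cannot be represented by a local Hamiltonian. For completeness we prove this claim here and
also show why the (plain) Hamming weight problem is 1-local.

Let $r$ be a bit string of length $n$, i.e., $r \in \{0,1\}^n$, and let

$$\sigma^r = \sigma_1^{r_1} \otimes \sigma_2^{r_2} \otimes \cdots \otimes \sigma_n^{r_n} , \tag{8}$$

with $\sigma_i^0 = \mathbb{1}_i$ and $\sigma_i^1 = \sigma_i^z$. This forms an orthonormal basis
for the vector space of diagonal Hamiltonians. Thus:

$$H_P = \sum_{r \in \{0,1\}^n} J_r \sigma^r , \tag{9}$$

with

$$J_r = \frac{1}{2^n} \mathrm{Tr}\left(\sigma^r H_P\right) \tag{10a}$$

$$= \frac{1}{2^n} \sum_{x \in \{0,1\}^n} f(x) \langle x | \sigma^r | x \rangle \tag{10b}$$

$$= \frac{1}{2^n} \sum_{x \in \{0,1\}^n} f(x) (-1)^{x \cdot r} . \tag{10c}$$

Note that generically $J_r$ will be non-zero for arbitrary-weight strings $r$, leading to
$|r|$-local terms in $H_P$, even as high as $n$-local.

E.g., substituting the plateau Hamiltonian [Eq. (5)] into this we obtain:

$$J_r = \frac{1}{2^n} \left[ \sum_{|x| \leq l \ \text{or}\ |x| \geq u} |x| (-1)^{x \cdot r} + (u - 1) \sum_{l < |x| < u} (-1)^{x \cdot r} \right] . \tag{11}$$

On the other hand, if $f(x) = |x|$ (i.e., in the absence of a perturbation), the Hamiltonian
is only 1-local:

$$H_P = \sum_{x \in \{0,1\}^n} |x| \, |x\rangle\langle x| \tag{12a}$$

$$= \sum_{x_1 = 0}^{1} \cdots \sum_{x_n = 0}^{1} (x_1 + \cdots + x_n) \, |x_1\rangle\langle x_1| \otimes |x_2\rangle\langle x_2| \otimes \cdots \otimes |x_n\rangle\langle x_n| \tag{12b}$$

$$= \sum_{k=1}^{n} \left( \sum_{x_k = 0}^{1} x_k |x_k\rangle\langle x_k| \right) \bigotimes_{j \neq k} \left( \sum_{x_j = 0}^{1} |x_j\rangle\langle x_j| \right) \tag{12c}$$

$$= \sum_{k=1}^{n} |1\rangle_k\langle 1| \bigotimes_{j \neq k} \mathbb{1}_j = \sum_{k=1}^{n} \frac{1}{2} \left( \mathbb{1} - \sigma_k^z \right) . \tag{12d}$$


III. METHODS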

A. Simulated Annealing 

SA is a general heuristic solver [2], whereby the system is initialized in a high temperature
state, i.e., in a random state, and the temperature is slowly lowered while undergoing Monte
Carlo dynamics. Local updates are performed according to the Metropolis rule [32, 33]: a spin
is flipped and the change in energy $\Delta E$ associated with the spin flip is calculated.
The flip is accepted with probability $P_{\rm Met}$:

$$P_{\rm Met} = \min\{1, \exp(-\beta \Delta E)\} , \tag{13}$$

where $\beta$ is the current inverse temperature along the anneal. Note that there could be
different schemes governing which spin is to be selected for the update. We consider two such
schemes: random spin-selection, where the next spin to be updated is selected at random; and
sequential spin-selection, where one runs through all of the $n$ spins in a sequence. Random
spin-selection (including just updating nearest neighbors) satisfies detailed balance and
thus is guaranteed to converge to the Boltzmann distribution. Sequential spin-selection does
not satisfy strict detailed balance (since the reverse move of sequentially updating in the
reverse order never occurs), but it too converges to the Boltzmann distribution [34]. In
sequential updating, a "sweep" refers to all the spins having been updated once. In random
spin-selection, we define a sweep as the total number of spin updates divided by the total
number of spins. When it is possible to parallelize the spin updates, the appropriate metric
of time-complexity is the number of sweeps $N_{\rm sw}$, not the number of spin updates (they
differ by a factor of $n$) [15]. However, in our problem this parallelization is not possible
and hence the appropriate metric is the number of spin updates, and this is what is plotted
in Fig. 2(b). After each sweep, the inverse temperature is incremented by an amount
$\Delta\beta$ according to an annealing schedule, which we take to be linear, i.e.,
$\Delta\beta = (\beta_f - \beta_i)/(N_{\rm sw} - 1)$.

We can use SA both as an annealer and as a solver [35]. In the former, the state at the end of
the evolution is the output of the algorithm, and SA can be thought of as a method to sample
from the Boltzmann distribution at a specified temperature. For the latter, we select the
state with the lowest energy found along the entire anneal as the output of the algorithm,
which is the better technique if one is only interested in finding the global minimum. We use
the latter to maximize the performance of the algorithm.

More details concerning the performance of SA in the 
context of the problem studied here are presented below. 
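
The procedure described in this subsection can be sketched as follows. This is a minimal illustration, not the code used for the results reported here: the function names `plateau_cost` and `sa_solver` are hypothetical, the default inverse temperatures are taken from the linear schedule quoted in the caption of Fig. 1, and the sketch uses sequential spin-selection in solver mode with the Metropolis rule of Eq. (13).

```python
import math
import random

def plateau_cost(w, l, u):
    """Eq. (5): cost as a function of the Hamming weight w."""
    return u - 1 if l < w < u else w

def sa_solver(n, l, u, n_sweeps, beta_i=0.1, beta_f=6.0, seed=0):
    """Sequential-update SA in solver mode; returns the lowest cost seen."""
    rng = random.Random(seed)
    bits = [rng.randint(0, 1) for _ in range(n)]   # random (hot) initial state
    w = sum(bits)
    energy = plateau_cost(w, l, u)
    best = energy
    # linear schedule: beta is incremented after every sweep
    d_beta = (beta_f - beta_i) / (n_sweeps - 1) if n_sweeps > 1 else 0.0
    beta = beta_i
    for _ in range(n_sweeps):
        for i in range(n):                         # one sweep over all spins
            w_new = w - 1 if bits[i] else w + 1
            d_e = plateau_cost(w_new, l, u) - energy
            # Metropolis rule, Eq. (13)
            if d_e <= 0 or rng.random() < math.exp(-beta * d_e):
                bits[i] ^= 1
                w, energy = w_new, energy + d_e
                best = min(best, energy)           # solver mode: track the best
        beta += d_beta
    return best
```

Because the cost depends only on the Hamming weight, the sketch tracks `w` incrementally rather than recomputing it after every flip; the total number of single-spin updates is `n * n_sweeps`.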


B. Quantum Annealing 

Here we consider the most common version of quantum annealing:

$$H(s) = (1-s) \sum_{i=1}^{n} \frac{1}{2} \left( \mathbb{1}_i - \sigma_i^x \right) + s \sum_{x \in \{0,1\}^n} f(x) |x\rangle\langle x| , \tag{14}$$

where $s = t/t_f$ is the dimensionless time parameter and $t_f$ is the total anneal time. The
initial state is taken to be $|+\rangle^{\otimes n}$, which is the ground state of $H(0)$.

The initial ground state and the total Hamiltonian are symmetric under qubit permutations
(recall that $f(x) = f(|x|)$ for our class of problems). It then follows that the
time-evolved state, at any point in time, will also obey the same symmetry. Therefore the
evolution is restricted to the $(n+1)$-dimensional symmetric subspace, a fact that we can take
advantage of in our numerical simulations. This symmetric subspace is spanned by the Dicke
states $|S, M\rangle$ with $S = n/2$, $M = -S, -S+1, \ldots, S$, which satisfy:

$$S^2 |S, M\rangle = S(S+1) |S, M\rangle , \tag{15a}$$

$$S^z |S, M\rangle = M |S, M\rangle , \tag{15b}$$

where $S^{x,y,z} = \frac{1}{2} \sum_{i=1}^{n} \sigma_i^{x,y,z}$ and
$S^2 = (S^x)^2 + (S^y)^2 + (S^z)^2$.

We can denote these states by:

$$|w\rangle \equiv \left| S = \frac{n}{2}, M = \frac{n}{2} - w \right\rangle = \binom{n}{w}^{-1/2} \sum_{x : |x| = w} |x\rangle , \tag{16}$$

where $w \in \{0, \ldots, n\}$.

In this basis the Hamiltonian is tridiagonal, with the following matrix elements:

$$\langle w | H(s) | w+1 \rangle = \langle w+1 | H(s) | w \rangle = -\frac{1}{2} (1-s) \sqrt{(n-w)(w+1)} , \tag{17a}$$

$$\langle w | H(s) | w \rangle = (1-s) \frac{n}{2} + s f(w) . \tag{17b}$$

The Schrödinger equation with this Hamiltonian can be solved reliably using an adaptive
Runge-Kutta Cash-Karp method [36] and the Dormand-Prince method [37] (both with orders 4 and
5).
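
To illustrate how the restriction to the symmetric subspace is used numerically, the following Python sketch builds the $(n+1) \times (n+1)$ tridiagonal matrix from the diagonal elements $(1-s)\,n/2 + s f(w)$ and off-diagonal elements $-\frac{1}{2}(1-s)\sqrt{(n-w)(w+1)}$ of Eq. (17), and computes the spectral gap by dense diagonalization. The function names are hypothetical, and dense diagonalization is simply the most direct choice for a sketch.

```python
import numpy as np

def plateau(w, l, u):
    """Plateau cost of Eq. (5) as a function of Hamming weight w."""
    return u - 1 if l < w < u else w

def symmetric_hamiltonian(n, s, f):
    """(n+1)x(n+1) matrix of H(s) in the Dicke basis |w>, w = 0..n [Eq. (17)]."""
    w = np.arange(n + 1)
    diag = (1.0 - s) * n / 2.0 + s * np.array([f(int(k)) for k in w], dtype=float)
    # couples |w> and |w+1> for w = 0..n-1
    off = -0.5 * (1.0 - s) * np.sqrt((n - w[:-1]) * (w[:-1] + 1.0))
    return np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)

def gap(n, s, f):
    """Gap between the two lowest eigenvalues in the symmetric subspace."""
    evals = np.linalg.eigvalsh(symmetric_hamiltonian(n, s, f))
    return evals[1] - evals[0]
```

For the unperturbed cost $f(w) = w$ the qubits decouple and the gap is $\sqrt{(1-s)^2 + s^2}$, which provides a simple consistency check; the minimum gap of a plateau instance can be located by scanning, e.g., `gap(n, s, lambda w: plateau(w, l, u))` over `s`.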

If the quantum dynamics is run adiabatically, the system remains close to the ground state
during the evolution, and an appropriate version of the adiabatic theorem is satisfied. For
evolutions with a constant spectral gap for all $s \in [0,1]$, an adiabatic condition of the
form

$$t_f \geq \mathrm{const} \times \sup_{s \in [0,1]} \frac{\| \partial_s H(s) \|}{\mathrm{Gap}(s)^2} \tag{18}$$

is often claimed to be sufficient [38] (however, see the discussion after Eq. (21) in
Ref. [13]). In our case $\| \partial_s H(s) \| = \| H(1) - H(0) \|$ is upper-bounded by $n$;
since we are considering a constant gap, the adiabatic algorithm can scale at most linearly by
condition (18). This is true for the plateau problems.

We demonstrate in SM-V that the following version of the adiabatic condition, known to hold
in the absence of resonant transitions between energy levels [39], estimates the scaling we
observe very well:

$$\frac{1}{t_f} \max_{s \in [0,1]} \frac{ \left| \langle \varepsilon_0(s) | \partial_s H(s) | \varepsilon_1(s) \rangle \right| }{\mathrm{Gap}(s)^2} \ll 1 , \tag{19}$$

where $\varepsilon_0(s)$ and $\varepsilon_1(s)$ are the instantaneous ground and first
excited states in the symmetric subspace, respectively. To extract $t_f$ from this condition
we simply ensure that for a given choice of $t_f$, condition (19) holds (recall that
$s = t/t_f$).
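
A rough numerical estimate of the quantity that $t_f$ must exceed in condition (19) can be obtained on a grid of $s$ values. The sketch below is an illustration under stated assumptions: it takes the numerator to be the matrix element of $\partial_s H(s) = H(1) - H(0)$ between the two lowest eigenstates of the symmetric-subspace Hamiltonian (with matrix elements as in Eq. (17)), the function names are hypothetical, and the grid resolution `num_s` is an arbitrary choice.

```python
import numpy as np

def symmetric_hamiltonian(n, s, f_vals):
    """H(s) in the Dicke basis; f_vals[w] is the cost at Hamming weight w."""
    w = np.arange(n)                                   # w = 0..n-1
    diag = (1.0 - s) * n / 2.0 + s * np.asarray(f_vals, dtype=float)
    off = -0.5 * (1.0 - s) * np.sqrt((n - w) * (w + 1.0))
    return np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)

def adiabatic_ratio(n, f_vals, num_s=201):
    """max over s of |<e0|dH/ds|e1>| / Gap(s)^2 for the linear schedule."""
    d_h = symmetric_hamiltonian(n, 1.0, f_vals) - symmetric_hamiltonian(n, 0.0, f_vals)
    worst = 0.0
    for s in np.linspace(0.0, 1.0, num_s):
        evals, evecs = np.linalg.eigh(symmetric_hamiltonian(n, s, f_vals))
        elem = abs(evecs[:, 0] @ d_h @ evecs[:, 1])    # sign-independent
        worst = max(worst, elem / (evals[1] - evals[0]) ** 2)
    return worst
```

For the unperturbed cost $f(w) = w$ this ratio equals $\sqrt{2n}$, attained at $s = 1/2$, so the estimate grows as $n^{1/2}$ at constant gap. A too-coarse grid may miss a narrow minimum gap, so `num_s` should be increased until the result is stable.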


C. Spin-Vector Dynamics 


Starting with the spin-coherent path integral formulation of the quantum dynamics, we can
obtain Spin Vector Dynamics (SVD) as the saddle-point approximation (see, for example, p. 10
of Ref. [21] or Refs. [40, 41]). It can be interpreted as a semi-classical limit describing
coherent single qubits interacting incoherently. In this sense, SVD is a well motivated
classical limit of the quantum evolution of QA. SVD describes the evolution of $n$ unit-norm
classical vectors under the Lagrangian (in units of $\hbar = 1$):

$$\mathcal{L} = i \langle \Omega(s) | \frac{d}{ds} | \Omega(s) \rangle - t_f \langle \Omega(s) | H(s) | \Omega(s) \rangle , \tag{20}$$

where $|\Omega(s)\rangle$ is a tensor product of $n$ independent spin-coherent states [42]:

$$|\Omega(s)\rangle = \bigotimes_{i=1}^{n} \left[ \cos\left( \frac{\theta_i(s)}{2} \right) |0\rangle_i + \sin\left( \frac{\theta_i(s)}{2} \right) e^{i \phi_i(s)} |1\rangle_i \right] . \tag{21}$$










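We can define an effective semiclassical potential associated with this Lagrangian:

$$V_{\rm SC}(\{\theta_i\}, \{\phi_i\}, s) = \langle \Omega(s) | H(s) | \Omega(s) \rangle$$

$$= (1-s) \sum_{i=1}^{n} \frac{1}{2} \left( 1 - \cos\phi_i(s) \sin\theta_i(s) \right) + s \sum_{x \in \{0,1\}^n} f(x) \prod_{j=1}^{n} \left[ \cos^2\left( \frac{\theta_j(s)}{2} \right) \right]^{1 - x_j} \left[ \sin^2\left( \frac{\theta_j(s)}{2} \right) \right]^{x_j} , \tag{22}$$

with the probability of finding the all-zero state at the end of the evolution (which is the
ground state in our case) given by $\prod_{i=1}^{n} \cos^2\left( \theta_i(1)/2 \right)$. The
quantum Hamiltonian obeys qubit permutation symmetry: $P H P^\dagger = H$, where $P$ is a
unitary operator that performs an arbitrary permutation of the qubits. This implies that our
classical Lagrangian obeys the same symmetry:

$$\mathcal{L}' = i \langle \Omega(s) | P^\dagger \frac{d}{ds} P | \Omega(s) \rangle - t_f \langle \Omega(s) | P^\dagger H(s) P | \Omega(s) \rangle$$

$$= i \langle \Omega(s) | \frac{d}{ds} | \Omega(s) \rangle - t_f \langle \Omega(s) | H(s) | \Omega(s) \rangle = \mathcal{L} , \tag{23}$$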

where the derivative operator is trivially permutation symmetric. Therefore, the
Euler-Lagrange equations of motion derived from this action will be identical for all spins.
Thus, if we have symmetric initial conditions, i.e.,
$(\theta_i(0), \phi_i(0)) = (\theta_j(0), \phi_j(0)) \ \forall i, j$, then the time evolved
state will also be symmetric:

$$(\theta_i(s), \phi_i(s)) = (\theta_j(s), \phi_j(s)) \quad \forall i, j \ \ \forall s \in [0,1] . \tag{24}$$

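As we show below, under the assumption of a permutation-invariant initial condition we only
need to solve two (instead of $2n$) semiclassical equations of motion:

$$\frac{n}{2} \sin\theta(s) \, \theta'(s) - t_f \, \partial_{\phi(s)} V_{\rm SC}^{\rm sym}(\theta(s), \phi(s), s) = 0 , \tag{25a}$$

$$-\frac{n}{2} \sin\theta(s) \, \phi'(s) - t_f \, \partial_{\theta(s)} V_{\rm SC}^{\rm sym}(\theta(s), \phi(s), s) = 0 , \tag{25b}$$

where we have defined the symmetric effective potential as:

$$V_{\rm SC}^{\rm sym}(\theta, \phi, s) = \langle \Omega^{\rm sym}(s) | H(s) | \Omega^{\rm sym}(s) \rangle$$

$$= (1-s) \frac{n}{2} \left( 1 - \cos\phi(s) \sin\theta(s) \right) + s \sum_{w=0}^{n} f(w) \binom{n}{w} \sin^{2w}\left( \frac{\theta(s)}{2} \right) \cos^{2(n-w)}\left( \frac{\theta(s)}{2} \right) , \tag{26}$$

and $|\Omega^{\rm sym}(s)\rangle$ is simply $|\Omega(s)\rangle$ with all the $\theta$'s and
$\phi$'s set equal. Note that in the main text [see Eq. (6)], we slightly abuse notation for
simplicity, and use $V_{\rm SC}$ instead of $V_{\rm SC}^{\rm sym}$. The probability of finding
the all-zero bit string at the end of the evolution is accordingly given by
$\cos^{2n}(\theta(1)/2)$. We would have arrived at the same equations of motion had we used
the symmetric spin coherent state in our path integral derivation, but that would have been
an artificial restriction. In our present derivation the symmetry of the dynamics naturally
imposes this restriction.

Note that the object in Eq. (22) involves a sum over all $2^n$ bit strings and is thus
exponentially hard to compute; on the other hand, the object in Eq. (26) only involves a sum
over $n + 1$ terms and is thus easy to compute. Therefore, just as in the quantum case (where
due to permutation symmetry the quantum evolution is restricted to the $(n+1)$-dimensional
subspace of symmetric states instead of the full $2^n$-dimensional Hilbert space), given
knowledge of the symmetry of the problem we can efficiently compute the SVD potential and
efficiently solve the SVD equations of motion.

We also remark that the computation of the potential in Eq. (22) is significantly simplified
if our cost function, $f(x)$, is given in terms of a local Hamiltonian. For example, if
$H(1) = \sum_{i,j} J_{ij} \sigma_i^z \sigma_j^z$ then:

$$V_{\rm SC}(\{\theta_i\}, \{\phi_i\}, 1) = \sum_{i,j} J_{ij} \cos\theta_i \cos\theta_j , \tag{27}$$

which is easy to compute if there is a poly($n$) number of terms.

Let us now derive the symmetric SVD equations of motion (25). Without any restriction to
symmetric spin-coherent states, the SVD equations of motion, for the pair $\theta_i, \phi_i$,
read:

$$\frac{1}{2} \sin\theta_i(s) \, \theta_i'(s) - t_f \, \partial_{\phi_i(s)} V_{\rm SC}(\{\theta_j\}, \{\phi_j\}, s) = 0 , \tag{28a}$$

$$-\frac{1}{2} \sin\theta_i(s) \, \phi_i'(s) - t_f \, \partial_{\theta_i(s)} V_{\rm SC}(\{\theta_j\}, \{\phi_j\}, s) = 0 . \tag{28b}$$

As can be seen by comparing Eqs. (25) and (28), it is sufficient to show that:

$$\left. \frac{\partial}{\partial \theta_i} V_{\rm SC} \right|_{\theta_j = \theta, \, \phi_j = \phi \ \forall j} = \frac{1}{n} \frac{\partial}{\partial \theta} V_{\rm SC}^{\rm sym} , \tag{29}$$

and an analogous statement holding for derivatives with respect to $\phi$. This claim is
easily seen to hold true for the term multiplying $(1 - s)$ in Eq. (22):

$$\left. \frac{\partial}{\partial \theta_i} \sum_{j=1}^{n} \frac{1}{2} \left( 1 - \cos\phi_j(s) \sin\theta_j(s) \right) \right|_{\theta_j = \theta, \, \phi_j = \phi \ \forall j} = \frac{1}{n} \frac{\partial}{\partial \theta} \, \frac{n}{2} \left( 1 - \cos\phi(s) \sin\theta(s) \right)$$

$$= \frac{1}{n} \frac{\partial}{\partial \theta} V_{\rm SC}^{\rm sym}(\theta, \phi, s = 0) , \tag{30}$$

where in the last line we used Eq. (26). Next we focus on the term multiplying $s$ in
Eq. (22). This term has no $\phi$ dependence and thus we only consider the $\theta$
derivatives. First note that

$$\frac{\partial}{\partial \theta_i} V_{\rm SC}(\{\theta_j\}, \{\phi_j\}, s = 1) = \frac{1}{2} \sin\theta_i \sum_{x \in \{0,1\}^n} f(x) \, (2 x_i - 1) \prod_{j \neq i} \left[ \cos^2\left( \frac{\theta_j}{2} \right) \right]^{1 - x_j} \left[ \sin^2\left( \frac{\theta_j}{2} \right) \right]^{x_j} . \tag{31}$$

Now, we set all the $\theta_i$'s equal. Let us define $p(\theta) = \sin^2(\theta/2)$, so that
$dp/d\theta = \frac{1}{2} \sin\theta$. Using this and the fact that $f$ is only a function of
the Hamming weight (which is equivalent to the qubit permutation symmetry), we can rewrite
the last expression, after a few steps of algebra, as (with the convention that
$\binom{n-1}{-1} = \binom{n-1}{n} = 0$):

$$\frac{dp}{d\theta} \sum_{w=0}^{n} f(w) \left[ \binom{n-1}{w-1} p^{w-1} (1-p)^{n-w} - \binom{n-1}{w} p^{w} (1-p)^{n-w-1} \right]$$

$$= \frac{1}{n} \frac{dp}{d\theta} \sum_{w=0}^{n} f(w) \binom{n}{w} p^{w-1} (1-p)^{n-w-1} (w - n p)$$

$$= \frac{1}{n} \frac{\partial}{\partial \theta} V_{\rm SC}^{\rm sym}(\theta, \phi, s = 1) . \tag{32}$$

Together with Eq. (30), this establishes Eq. (29).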
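
The reduction from a sum over $2^n$ bit strings [Eq. (22)] to a sum over $n+1$ Hamming weights [Eq. (26)] can be checked numerically for small $n$. The Python sketch below is an illustration under stated assumptions: it takes the product state to assign probability $\cos^2(\theta_j/2)$ to bit value 0 and $\sin^2(\theta_j/2)$ to bit value 1 on qubit $j$, and the function names `v_sym` and `v_full` are hypothetical.

```python
import itertools
import math

def v_sym(theta, phi, s, n, f):
    """Symmetric potential of Eq. (26): a sum over n + 1 Hamming weights."""
    p = math.sin(theta / 2.0) ** 2                 # probability of a single '1'
    driver = (1.0 - s) * 0.5 * n * (1.0 - math.cos(phi) * math.sin(theta))
    cost = sum(f(w) * math.comb(n, w) * p ** w * (1.0 - p) ** (n - w)
               for w in range(n + 1))
    return driver + s * cost

def v_full(thetas, phis, s, f):
    """Unrestricted potential of Eq. (22): a sum over all 2^n bit strings."""
    n = len(thetas)
    driver = sum(0.5 * (1.0 - math.cos(ph) * math.sin(th))
                 for th, ph in zip(thetas, phis))
    cost = 0.0
    for x in itertools.product((0, 1), repeat=n):
        prob = 1.0                                  # |<x|Omega>|^2 for a product state
        for th, bit in zip(thetas, x):
            prob *= math.sin(th / 2.0) ** 2 if bit else math.cos(th / 2.0) ** 2
        cost += f(sum(x)) * prob
    return (1.0 - s) * driver + s * cost
```

With all angles set equal, `v_full([theta] * n, [phi] * n, s, f)` and `v_sym(theta, phi, s, n, f)` should agree to numerical precision, while only the latter remains tractable at the problem sizes considered here.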


D. Simulated Quantum Annealing 

Although not discussed in the main text, in SM-V we use an alternative method to simulated
annealing, namely simulated quantum annealing (SQA, or Path Integral Monte Carlo along the
Quantum Annealing schedule) [43, 44]. This is an annealing algorithm based on discrete-time
path-integral quantum Monte Carlo simulations of the transverse field Ising model using Monte
Carlo dynamics. At a given time $t$ along the anneal, the Monte Carlo dynamics samples from
the Gibbs distribution defined by the action:

$$S[\mu] = \Delta(t) \sum_{\tau} H_P(\mu_{:,\tau}) - J_\perp(t) \sum_{i,\tau} \mu_{i,\tau} \mu_{i,\tau+1} , \tag{33}$$

where $\Delta(t) = \beta B(t)/N_\tau$ is the spacing along the time-like direction,
$J_\perp = -0.5 \ln(\tanh(A(t)/2))$ is the ferromagnetic spin-spin coupling along the
time-like direction, and $\mu$ denotes a spin configuration with a space-like direction (the
original problem direction, indexed by $i$) and a time-like direction (indexed by $\tau$). For
our spin updates, we perform Wolff cluster updates [45] along the imaginary-time direction
only. For each space-like slice, a random spin along the time-like direction is picked. The
neighbors of this spin are added to the cluster (assuming they are parallel) with probability

$$P = 1 - \exp(-2 J_\perp) . \tag{34}$$

Once all the spins in the cluster have had their neighbors along the time-like direction
tested, the cluster is flipped according to the Metropolis probability using the space-like
change in energy associated with flipping the cluster. A single sweep involves attempting to
update a single cluster on each space-like slice.

We can use SQA both as an annealer and as a solver 
[35]. In the former, we randomly pick one of the states 
on the Trotter slices at the end of the evolution as the 
output of the algorithm, while for the latter, we pick 
the state with the lowest energy found along the entire 
anneal as the output of the algorithm. We use the latter 
to maximize the performance of the algorithm. 
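The cluster-growth step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the reported data: the function names are hypothetical, the imaginary-time direction is taken to be periodic, and the argument `a` of `j_perp` simply stands for whatever quantity appears inside the $\tanh$ in the definition of $J_\perp$. The Metropolis accept/reject step for flipping the cluster is omitted.

```python
import math
import random


def j_perp(a):
    """Ferromagnetic coupling along imaginary time: J_perp = -0.5 ln tanh(a / 2)."""
    return -0.5 * math.log(math.tanh(a / 2.0))


def bond_probability(j):
    """Wolff probability of adding a parallel neighbor, Eq. (34)."""
    return 1.0 - math.exp(-2.0 * j)


def grow_time_cluster(column, seed, p_add, rng):
    """Grow a cluster along one imaginary-time column (assumed periodic).

    column: list of +1/-1 spins for a fixed space-like index.
    seed:   index of the randomly picked starting spin.
    Returns the set of time-slice indices belonging to the cluster.
    """
    n_tau = len(column)
    cluster = {seed}
    frontier = [seed]
    while frontier:
        tau = frontier.pop()
        for nb in ((tau - 1) % n_tau, (tau + 1) % n_tau):
            # Only parallel neighbors not yet in the cluster are candidates.
            if nb not in cluster and column[nb] == column[tau]:
                if rng.random() < p_add:
                    cluster.add(nb)
                    frontier.append(nb)
    return cluster


rng = random.Random(0)
p_add = bond_probability(j_perp(0.2))
column = [1, 1, -1, 1, 1, 1, -1, 1]
print(p_add, sorted(grow_time_cluster(column, seed=4, p_add=p_add, rng=rng)))
```

Since $\exp(-2J_\perp) = \tanh(a/2)$, the bond probability equals $1-\tanh(a/2)$, so a small argument (strong time-like coupling) yields large clusters.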


IV. REVIEW OF THE HAMMING WEIGHT
PROBLEM AND REICHARDT'S BOUND FOR
PHWO PROBLEMS


Here we closely follow Ref. [10]. 


A. The Hamming weight problem 

We review the analysis within QA of the minimization of the Hamming weight function $f_{\rm HW}(x) = |x|$, which counts the number of 1's in the bit string $x$. This problem is of course trivial, and the analysis given here is done in preparation for the perturbed problem.

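For the adiabatic algorithm, we start with the driver Hamiltonian,

$$H_D = \frac{1}{2}\sum_{i=1}^{n}\left(\mathbb{1}_i - \sigma_i^x\right) = \sum_{i=1}^{n}|-\rangle_i\langle -| , \tag{35}$$

which has $|+\rangle^{\otimes n}$ as the ground state.

The final Hamiltonian for the cost function is

$$H_P = \frac{1}{2}\sum_{i=1}^{n}\left(\mathbb{1}_i - \sigma_i^z\right) = \sum_{i=1}^{n}|1\rangle_i\langle 1| , \tag{36}$$

which has $|0\rangle^{\otimes n}$ as the ground state.

We interpolate linearly between $H_D$ and $H_P$:

$$H(s) = (1-s)H_D + sH_P , \qquad s\in[0,1] \tag{37}$$

$$= \sum_{i=1}^{n}\frac{1}{2}\left[(1-s)\left(\mathbb{1}_i-\sigma_i^x\right) + s\left(\mathbb{1}_i-\sigma_i^z\right)\right] \tag{38}$$

$$= \sum_{i=1}^{n}\frac{1}{2}\begin{pmatrix} 1-s & -(1-s) \\ -(1-s) & 1+s \end{pmatrix}_i . \tag{39}$$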

Since there are no interactions between the qubits, this problem can be solved exactly by diagonalizing the Hamiltonian on each qubit separately. For each term, we have the energy eigenvalues $E_\pm(s)$,

$$E_\pm(s) = \frac{1}{2}\left(1\pm\Delta(s)\right) ; \qquad \Delta(s) = \sqrt{1-2s+2s^2} , \tag{40}$$

and associated eigenvectors

$$|v_\pm(s)\rangle = \frac{1}{\sqrt{2\Delta(\Delta\mp s)}}\left[\mp(\Delta\mp s)|0\rangle + (1-s)|1\rangle\right] . \tag{41}$$

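The ground state of $H(s)$ is $|v_-(s)\rangle^{\otimes n}$. The gap is given by

$$\mathrm{Gap}[H(s)] = \left(\langle v_+(s)|\otimes\langle v_-(s)|^{\otimes(n-1)}\right)H(s)\left(|v_+(s)\rangle\otimes|v_-(s)\rangle^{\otimes(n-1)}\right) - \langle v_-(s)|^{\otimes n}H(s)|v_-(s)\rangle^{\otimes n} \tag{42a}$$

$$= E_+ + (n-1)E_- - nE_- \tag{42b}$$

$$= E_+ - E_- \tag{42c}$$

$$= \Delta(s) . \tag{42d}$$

The gap is minimized at $s = \frac{1}{2}$ with minimum value $\Delta(\frac{1}{2}) = \frac{1}{\sqrt{2}}$. The minimum gap is independent of $n$ and hence does not scale with problem size. Therefore the adiabatic run time is given by

$$t_f = O\!\left(\frac{\|H\|}{\Delta^2}\right) = O(n) , \tag{43}$$

where the $n$-dependence is solely due to $\|H\|$ (see SM-III B).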
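The single-qubit spectrum can be checked numerically. The sketch below (an illustration only; all function names are hypothetical) diagonalizes the $2\times 2$ block of Eq. (39) using the closed form for a real symmetric matrix and confirms that the gap equals $\Delta(s)$ of Eq. (40), with its minimum at $s = 1/2$.

```python
import math


def single_qubit_hamiltonian(s):
    """The 2x2 block of Eq. (39) for one qubit, as nested tuples."""
    return ((0.5 * (1 - s), -0.5 * (1 - s)),
            (-0.5 * (1 - s), 0.5 * (1 + s)))


def eigenvalues_2x2_symmetric(m):
    """Eigenvalues (low, high) of a real symmetric 2x2 matrix."""
    (a, b), (_, d) = m
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius


def gap(s):
    """Numerical gap E_+ - E_- of the single-qubit block."""
    low, high = eigenvalues_2x2_symmetric(single_qubit_hamiltonian(s))
    return high - low


def delta(s):
    """Analytic gap of Eq. (40)."""
    return math.sqrt(1 - 2 * s + 2 * s * s)


def minimum_gap(num_points=1001):
    """Grid search for the minimum gap over s in [0, 1]."""
    grid = [i / (num_points - 1) for i in range(num_points)]
    s_min = min(grid, key=gap)
    return s_min, gap(s_min)


print(minimum_gap())
```

Because the $n$ qubits are decoupled, this single-qubit gap is also the gap of the full $n$-qubit Hamiltonian, as stated in Eq. (42).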

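It is also useful to consider the form of $|v_-(s)\rangle^{\otimes n}$. We can write

$$|v_-(s)\rangle^{\otimes n} = \frac{1}{\left[2\Delta(\Delta+s)\right]^{n/2}}\sum_{x\in\{0,1\}^n}(1-s)^{|x|}\left(\Delta(s)+s\right)^{n-|x|}|x\rangle . \tag{44}$$

If we measure in the computational basis, the probability of getting outcome $x$ is determined by $|x|$:

$$\Pr[x](s) = \left|\langle x|v_-(s)\rangle^{\otimes n}\right|^2 = q(s)^{|x|}\left(1-q(s)\right)^{n-|x|} , \tag{45}$$

where

$$q(s) = \frac{(1-s)^2}{2\Delta(\Delta+s)} . \tag{46}$$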
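A short numerical sketch of Eqs. (45) and (46) follows; it is illustrative only and the function names are hypothetical. It evaluates $q(s)$, checks the equivalent form $q(s) = \frac{1}{2}(1 - s/\Delta(s))$ that follows from $(1-s)^2 + (\Delta+s)^2 = 2\Delta(\Delta+s)$, and builds the resulting binomial distribution over Hamming weights.

```python
import math


def delta(s):
    """Single-qubit gap of Eq. (40)."""
    return math.sqrt(1 - 2 * s + 2 * s * s)


def q(s):
    """Probability of measuring 1 on a single qubit in the ground state, Eq. (46)."""
    d = delta(s)
    return (1 - s) ** 2 / (2 * d * (d + s))


def weight_distribution(n, s):
    """Probability of each Hamming weight w = 0..n in the n-qubit ground state."""
    qs = q(s)
    return [math.comb(n, w) * qs**w * (1 - qs) ** (n - w) for w in range(n + 1)]


print(q(0.0), q(0.5), q(1.0))
print(sum(weight_distribution(20, 0.3)))
```

The endpoints $q(0) = 1/2$ and $q(1) = 0$ correspond to the uniform superposition and the all-zeros string, respectively.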


B. Reichardt’s bound for PHWO problems 


Here we review Reichardt's derivation of the gap lower-bound for general PHWO problems, but provide additional details not found in the original proof [10].

We use the same initial Hamiltonian [Eq. (35)] and linear interpolation schedule as before, $\tilde H(s) = (1-s)H_D + s\tilde H_P$, and choose the final Hamiltonian to be

$$\tilde H_P = \sum_{x} f(x)|x\rangle\langle x| , \tag{47}$$

where

$$f(x) = \begin{cases} |x| + p(x) & l < |x| < u , \\ |x| & \text{elsewhere} , \end{cases} \tag{48}$$

where $p(x) \ge 0$ is the perturbation. Note that here we have not assumed that the perturbation, $p(x)$, respects qubit permutation symmetry.

We wish to bound the minimum gap of $\tilde H(s)$. Unlike the Hamming weight problem $H(s)$, this problem is no longer non-interacting. Define

$$h_k = \max_{|x|=k} p(x) ; \qquad h = \max_k h_k = \max_x p(x) . \tag{49}$$

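Lemma 1 ([10]). Let $u = O(l)$ and let $E_0(s)$ and $\tilde E_0(s)$ be the ground state energies of $H(s)$ and $\tilde H(s)$, respectively. Then $\tilde E_0(s) \le E_0(s) + O\!\left(h\,\frac{u-l}{\sqrt{l}}\right)$.

Proof. First note that

$$\tilde H(s) - H(s) = s\sum_{x:\,l<|x|<u} p(x)|x\rangle\langle x| . \tag{50}$$

Below, we suppress the $s$ dependence of all the terms for notational simplicity. We know that $E_0 = \langle v_-^{\otimes n}|H|v_-^{\otimes n}\rangle$. Using this,

$$\langle\tilde E_0|\tilde H|\tilde E_0\rangle \le \langle\psi|\tilde H|\psi\rangle \quad \forall\,|\psi\rangle\in\mathcal{H} \tag{51a}$$

$$\Rightarrow\ \tilde E_0 - E_0 \le \langle v_-^{\otimes n}|\tilde H|v_-^{\otimes n}\rangle - E_0 \tag{51b}$$

$$= \langle v_-^{\otimes n}|\tilde H - H|v_-^{\otimes n}\rangle \tag{51c}$$

$$= s\sum_{x:\,l<|x|<u} p(x)\left|\langle v_-^{\otimes n}|x\rangle\right|^2 \tag{51d}$$

$$= s\sum_{x:\,l<|x|<u} p(x)\,q^{|x|}(1-q)^{n-|x|} \tag{51e}$$

$$\le s\,h\sum_{k:\,l<k<u}\binom{n}{k}q^{k}(1-q)^{n-k} , \tag{51f}$$

where $\binom{n}{k}$ is the number of strings with Hamming weight $k$, and we used Eq. (45).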

Consider the partial binomial sum (dropping the $s$ dependence),

$$\sum_{k:\,l<k<u}\binom{n}{k}q^{k}(1-q)^{n-k} . \tag{52}$$

Using the fact that the binomial is well-approximated by the Gaussian in the large $n$ limit (note that this approximation requires that $q(s)$ and $1-q(s)$ not be too close to zero), we can write:

$$\sum_{k:\,l<k<u}\binom{n}{k}q^{k}(1-q)^{n-k} \approx \int_{l}^{u} dk\,\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(k-\mu)^2}{2\sigma^2}} = \int_{(l-\mu)/\sigma}^{(u-\mu)/\sigma} dt\,\phi(t) , \tag{53}$$

where $\mu = nq$, $\sigma = \sqrt{nq(1-q)}$ and $\phi(t) = e^{-t^2/2}/\sqrt{2\pi}$. Note that $\sigma$ and $\mu$ depend on $n$, and also on $s$ via $q(s)$. The parameters $l$ and $u$ are specified by the problem Hamiltonian, and are therefore allowed to depend on $n$ as long as $l(n) < u(n) < n$ is satisfied for all $n$.

Let us define:

$$B(s,n,l(n),u(n)) = \int_{(l(n)-\mu(n,s))/\sigma(n,s)}^{(u(n)-\mu(n,s))/\sigma(n,s)} dt\,\frac{e^{-t^2/2}}{\sqrt{2\pi}} . \tag{54}$$

We seek an upper bound on this function. We observe that $q(s)$ decreases monotonically from $\frac{1}{2}$ to 0 as $s$ goes from 0 to 1. Thus, the mean of the Gaussian $\mu(n,s) = nq(s)$ decreases from $\frac{n}{2}$ to 0. Depending on the values of $l(n)$, $u(n)$ and $\mu(n,s)$, we thus have three possibilities: (i) $l(n) < \mu(n,s) < u(n)$, (ii) $\mu(n,s) < l(n) < u(n)$, and (iii) $l(n) < u(n) < \mu(n,s)$. Note that (ii) and (iii) are cases where the integral runs over the tails of the Gaussian and so the integral is exponentially small. We focus on (i), as this induces the maximum values of the integral. In this case the lower limit of the integral Eq. (54) is negative, while the upper limit is positive. Thus, the integral runs through the center of the standard Gaussian, and we can upper-bound the value of the integral by the area of the rectangle of width $(u(n)-l(n))/\sigma(n,s)$ and height $1/\sqrt{2\pi}$. Hence

$$B(s,n,l(n),u(n)) \le \frac{1}{\sqrt{2\pi}}\,\frac{u(n)-l(n)}{\sigma(n,s)} \tag{55a}$$

$$= \frac{1}{\sqrt{2\pi}}\,\frac{u(n)-l(n)}{\sqrt{nq(s)\left(1-q(s)\right)}} \tag{55b}$$

$$< \frac{1}{\sqrt{2\pi}}\,\frac{u(n)-l(n)}{\sqrt{l(n)\left(1-q(s)\right)}} , \tag{55c}$$

where we have used the fact that $l(n) < \mu(n,s) = nq(s)$. Since $1-q(s) \ge \frac{1}{2}$, the factor $1/\sqrt{1-q(s)}$ is bounded by a constant. Thus, we obtain the bound:

$$\tilde E_0 - E_0 \le O\!\left(h\,\frac{u-l}{\sqrt{l}}\right) . \tag{56}$$

□


Lemma 2 ([10]). If $\tilde H - H$ is non-negative, then the spectrum of $\tilde H$ lies above the spectrum of $H$. That is, $\tilde E_j \ge E_j$ for all $j$, where $E_j$ and $\tilde E_j$ denote the eigenvalues of $H$ and $\tilde H$, respectively, listed in ascending order (so that $j = 0$ labels the ground state).

This can be proved by a straightforward application of the Courant-Fischer min-max theorem (see, for example, Ref. [46]).

Combining these lemmas results in the desired bound on the gap:

$$\mathrm{Gap}[\tilde H(s)] = \tilde E_1 - \tilde E_0 \tag{57a}$$

$$\ge E_1 - \tilde E_0 \tag{57b}$$

$$= E_1 - E_0 - (\tilde E_0 - E_0) \tag{57c}$$

$$\ge \Delta - O\!\left(h\,\frac{u-l}{\sqrt{l}}\right) , \tag{57d}$$

where in Eq. (57b) we used Lemma 2 and in Eq. (57d) we used Lemma 1.

Now, if we choose a parameter regime for the perturbation such that $h\,\frac{u-l}{\sqrt{l}} = o(1)$, then the perturbed problem maintains a constant gap. For example, if $l = \Theta(n)$ and $h(u-l) = O(n^{1/2-\epsilon})$ for any $\epsilon > 0$, then the gap is constant as $n\to\infty$.
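To get a feel for the rectangle bound of Eq. (55a), the following sketch compares the exact partial binomial sum of Eq. (52) with $(u-l)/(\sqrt{2\pi}\,\sigma)$ for one illustrative parameter choice in case (i). This is a numerical illustration rather than a proof: the bound in the text is derived for the Gaussian approximation, and the function names and the values of $n$, $q$, $l$, $u$ are assumptions chosen only for this example.

```python
import math


def binomial_pmf(n, k, q):
    """Binomial probability C(n, k) q^k (1 - q)^(n - k), computed in log space."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(q) + (n - k) * math.log(1.0 - q))
    return math.exp(log_pmf)


def partial_binomial_sum(n, q, l, u):
    """Sum of the binomial pmf over Hamming weights l < k < u, Eq. (52)."""
    return sum(binomial_pmf(n, k, q) for k in range(l + 1, u))


def rectangle_bound(n, q, l, u):
    """Rectangle estimate (u - l) / (sqrt(2 pi) sigma) of Eq. (55a)."""
    sigma = math.sqrt(n * q * (1.0 - q))
    return (u - l) / (math.sqrt(2.0 * math.pi) * sigma)


# Illustrative case (i): the mean n*q = 300 lies between l and u.
n, q, l, u = 1000, 0.3, 290, 310
print(partial_binomial_sum(n, q, l, u), rectangle_bound(n, q, l, u))
```

For these values the exact sum lies below the rectangle estimate, as expected when the window $(l,u)$ straddles the mean.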


V. ADIABATIC SCALING 

In order to study the adiabatic scaling, we consider the minimum time $\tau_{\rm ThC}$ required to reach the ground state with some probability $p_{\rm ThC}$, where we choose $p_{\rm ThC}$ to ensure that we are exploring a regime close to adiabaticity for QA. We call this benchmark metric the "threshold criterion," and set $p_{\rm ThC} = 0.9$. As seen in Fig. 4, QA scales polynomially, approximately as $n^{0.5}$. It is also clear that the adiabatic criterion given by Eq. (19) provides an excellent proxy for the scaling of QA.

In light of a spate of recent negative results concerning 
the possibility of an advantage of SQA over SA (e.g., 
Ref. [8]), it is remarkable, and of independent interest, 
that SQA scales better than SA for the plateau problem. 



FIG. 4. Log-log plot of the scaling of the time to reach a success probability of 0.9, as a function of system size $n$ and $u = 6$, for QA, SQA ($\beta = 30$, $N_\tau = 64$) and SA ($\beta_f = 20$). The time for SQA and SA is measured in single-spin updates. We also show the scaling of the adiabatic condition as defined in Eq. (19) since it shows the same scaling as QA but can be calculated for larger spin systems. QA and the adiabatic condition scale approximately as $n^{0.5}$. SQA scales more favorably ($\sim n^{1.5}$) than SA ($\sim n^{5}$).


VI. ANALYSIS OF SIMULATED ANNEALING 
USING RANDOM SPIN SELECTION 

Here we analyze SA for the plateau and the Hamming weight problems. We consider a version of SA with random spin-selection as the rule that generates candidates for Metropolis updates.

An example of the plateau is illustrated in Fig. 5, with perturbation applied between strings of Hamming weight 3 and 8. Suppose we start from a random bit-string. For large $n$, with very high probability, we will start at a bit-string with Hamming weight close to $n/2$. The plateau may be to the left or to the right of $n/2$; if the plateau is to the right, then most likely the random walker will not encounter it and fall quickly to the ground state in at most $O(n^2)$ steps (see a few paragraphs below for a derivation).



FIG. 5. $l = 3$, $u = 8$.

Thus, the interesting case is when the random walker arrives at the plateau from the right. In this case, how much time would it take, typically, for the walker to fall off the left edge? It is intuitively clear that traversing the plateau will be the dominant contribution to the time taken to reach the ground state, as after that the random walker can easily walk down the potential. As we show later below, this time can be at most $O(n^2)$ (ignoring transitions which take it back onto the plateau) for an inverse temperature that scales as $\beta = \Omega(\log n)$.

To evaluate the time to fall off the plateau, let us model the situation as follows. First, note that the perturbation is applied on strings of Hamming weight $l+1, l+2, \dots, u-1$, so the width of the plateau is $w = u-l-1$. Consider a random walk on a line of $w+1$ nodes labelled $0, 1, \dots, w$. Node $i$ represents the set of bit strings with Hamming weight $l+i$, with $0 \le i \le w$. We assume that the random walker starts at node $w$, as it is falling onto the right edge of the plateau. Only nearest-neighbor moves are allowed and the walk terminates if the walker reaches node 0.

Our model will estimate a shorter than actual time to fall off the left edge, because in the actual PHWO problem one can also go back up the slope on the right, and in addition we disallow transitions from strings of Hamming weight $l$ to $l+1$. This is justified because the Metropolis rule exponentially (in $\beta$) suppresses these transitions.

The transition probabilities $p_{i\to j}$ for this problem can be written as a $(w+1)\times(w+1)$ row-stochastic matrix $P_{ij} = p_{i\to j}$. $P$ is a tridiagonal matrix with zeroes on the diagonal, except at $p_{00}$ and $p_{ww}$. First consider $1 \le i \le w-1$. If the walker is at node $i$, then its Hamming weight is $l+i$. Thus the walker will move to $i+1$ (which has Hamming weight $l+i+1$) with probability $\frac{n-(l+i)}{n}$ (the chance that the bit picked had the value 0). Now consider $1 \le i \le w$: the Hamming weight will decrease to $l+i-1$ with probability $\frac{l+i}{n}$ (the chance that the bit picked had the value 1). Combining this with the fact that a walker at node 0 stays put, we can write:

$$b_i = p_{i\to i} = \begin{cases} 1 & \text{if } i = 0 , \\ 0 & \text{if } 1 \le i \le w-1 , \\ 1-\frac{l+w}{n} & \text{if } i = w , \end{cases} \tag{58a}$$

$$a_i = p_{i\to i+1} = \begin{cases} 0 & \text{if } i = 0 , \\ 1-\frac{l+i}{n} & \text{if } i = 1,\dots,w-1 , \end{cases} \tag{58b}$$

$$c_i = p_{i\to i-1} = \frac{l+i}{n} \quad \text{if } i = 1,2,\dots,w . \tag{58c}$$

Let $X(t)$ be the position of the random walker at time-step $t$. The random variable measuring the number of steps taken by the random walker starting from node $r$ to reach node $s$ for the first time is

$$T_{r,s} = \min\{t > 0 : X(t) = s,\ X(t-1) \ne s \mid X(0) = r\} . \tag{59}$$

The quantity we are after is $\mathbb{E}T_{w,0}$, the expectation value of the random variable $T_{w,0}$, i.e., the mean time taken by the random walker to fall off the plateau. Since only nearest-neighbor moves are allowed we have

$$\mathbb{E}T_{w,0} = \sum_{r=1}^{w}\mathbb{E}T_{r,r-1} . \tag{60}$$


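Stefanov [47] (see also Ref. [48]) has shown that

$$\mathbb{E}T_{r,r-1} = \frac{1}{c_r}\left[1 + \sum_{s=r+1}^{w}\ \prod_{t=r+1}^{s}\frac{a_{t-1}}{c_t}\right] , \tag{61}$$

where $a_w = 0$. Evaluating the sum term by term:

$$\mathbb{E}T_{w,w-1} = \frac{n}{l+w} , \tag{62a}$$

$$\mathbb{E}T_{w-1,w-2} = \frac{n}{l+w-1}\left[1 + \frac{n-(l+w-1)}{l+w}\right] , \tag{62b}$$

$$\mathbb{E}T_{w-2,w-3} = \frac{n}{l+w-2}\left[1 + \frac{n-(l+w-2)}{l+w-1} + \frac{n-(l+w-2)}{l+w-1}\times\frac{n-(l+w-1)}{l+w}\right] , \tag{62c}$$

$$\mathbb{E}T_{w-k,w-k-1} = \frac{n}{l+w-k}\left[1 + \frac{n-(l+w-k)}{l+w-(k-1)} + \dots + \frac{n-(l+w-k)}{l+w-(k-1)}\times\dots\times\frac{n-(l+w-2)}{l+w-1}\times\frac{n-(l+w-1)}{l+w}\right] . \tag{62d}$$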
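The mean times in Eq. (62) can be generated by a one-step recursion that follows from conditioning on the first move of the walker, $\mathbb{E}T_{r,r-1} = (1 + a_r\,\mathbb{E}T_{r+1,r})/c_r$ with $a_w = 0$. The sketch below (illustrative only; function names and the parameter values are assumptions) evaluates this recursion and compares the total with a direct Monte Carlo simulation of the simplified random-walk model.

```python
import random


def step_times(n, l, w):
    """Mean first-passage times E T_{r,r-1} for r = w, ..., 1 on the plateau."""
    times = {}
    above = 0.0  # E T_{r+1,r}; unused at r = w because a_w = 0
    for r in range(w, 0, -1):
        c = (l + r) / n                               # down move, Eq. (58c)
        a = 0.0 if r == w else (n - (l + r)) / n      # up move, Eq. (58b)
        above = (1.0 + a * above) / c
        times[r] = above
    return times


def mean_exit_time(n, l, w):
    """Mean time E T_{w,0} to fall off the left edge, Eq. (60)."""
    return sum(step_times(n, l, w).values())


def simulate_exit_time(n, l, w, rng):
    """One run of the simplified walk: start at node w, stop at node 0."""
    node, steps = w, 0
    while node > 0:
        steps += 1
        if rng.random() < (l + node) / n:   # picked a bit with value 1
            node -= 1
        elif node < w:                      # picked a 0; blocked at node w
            node += 1
    return steps


n, l, w = 20, 3, 4   # corresponds to l = 3, u = 8 as in Fig. 5
rng = random.Random(0)
runs = 4000
estimate = sum(simulate_exit_time(n, l, w, rng) for _ in range(runs)) / runs
print(mean_exit_time(n, l, w), estimate)
```

For these values the recursion gives $\mathbb{E}T_{4,3} = 20/7$ and $\mathbb{E}T_{3,2} = 10$, in agreement with Eqs. (62a) and (62b).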


Now consider the following cases:

1. $l, u = O(1)$: Here, using the fact that $k = O(w) = O(1)$, we conclude that $\mathbb{E}T_{w-k,w-k-1} = O(n^{k+1})$. Since the leading order term is $\mathbb{E}T_{w-(w-1),w-(w-1)-1} = \mathbb{E}T_{1,0}$, the time to fall off the plateau is $O(n^{w}) = O(n^{u-l-1})$.

2. For Reichardt's bound to give a constant lower-bound to the quantum problem, we need $u = l + o(l^{1/4})$. Since at most we can have $l = O(n)$, we can conclude $\mathbb{E}T_{w-k,w-k-1} = O\!\left((n/l)^{k+1}\right)$. Therefore, the time to fall off becomes $\mathbb{E}T_{w,0} = O\!\left(w\,(n/l)^{w}\right)$.

• If $l = \Theta(n)$ and $w = O(1)$, we can see that $\mathbb{E}T_{w,0} = O(1)$, which is a constant time scaling.

• If $l = \Theta(n)$ and $w = O(n^{a})$, where $0 < a < 1/4$, then $\mathbb{E}T_{w,0} = O\!\left(n^{a}\,O(1)^{n^{a}}\right)$, which is super-polynomial.

• More generally, if $l = \Theta(n^{b})$, with $b \le 1$, and $w = O(n^{a})$, where $0 < a < b/4$, then we get the scaling $\mathbb{E}T_{w,0} = O\!\left(n^{a}\,O(n^{1-b})^{n^{a}}\right)$.
Analysis of SA for plain Hamming weight. Let us analyze the behavior of fixed-temperature simulated annealing (i.e., there is no annealing schedule) on the plain Hamming weight problem. Here the transition probabilities are:

$$c_i = p_{i\to i-1} = \frac{i}{n} , \tag{63a}$$

$$a_i = p_{i\to i+1} = \frac{n-i}{n}\,e^{-\beta} , \tag{63b}$$

with $i = 1, 2, \dots, n$ denoting strings of Hamming weight $i$, and $\beta$ the inverse temperature. Using the Stefanov formula (61), we can write (after much simplification):

$$\mathbb{E}T_{n-k,n-k-1} = \frac{n}{n-k}\binom{n}{k}^{-1}\sum_{l=0}^{k}e^{-l\beta}\binom{n}{k-l} . \tag{64}$$
Thus,

$$\mathbb{E}T_{n,0} = \sum_{k=0}^{n-1}\frac{n}{n-k}\binom{n}{k}^{-1}\sum_{l=0}^{k}e^{-l\beta}\binom{n}{k-l} . \tag{65}$$

This is the worst-case scenario as we are assuming that we start from the string of Hamming weight $n$, which is the farthest from the all-zeros string. Note that if we start from a random spin configuration, then with overwhelming probability, we will pick a string with Hamming weight close to $n/2$. Thus, most probably, $\mathbb{E}T_{n/2,0}$ will be the time to hit the ground state. We can write this as:

$$\mathbb{E}T_{n/2,0} = \sum_{k=n/2}^{n-1}\frac{n}{n-k}\binom{n}{k}^{-1}\sum_{l=0}^{k}e^{-l\beta}\binom{n}{k-l} . \tag{66}$$


We first show that $\beta = O(1)$ will lead to an exponential time to hit the ground state. We show this by showing that $\mathbb{E}T_{1,0}$ is exponential if $\beta = O(1)$. To this end,

$$\mathbb{E}T_{1,0} = \mathbb{E}T_{n-(n-1),\,n-n} \tag{67a}$$

$$= \sum_{l=0}^{n-1}e^{-l\beta}\binom{n}{n-1-l} \tag{67b}$$

$$= e^{\beta}\left[(e^{-\beta}+1)^{n} - 1\right] , \tag{67c}$$

which is clearly exponential in $n$ if $\beta = O(1)$. Now let us suppose we have $\beta = \log n$, i.e., we decrease the temperature inverse-logarithmically in system size. In this case,

$$\mathbb{E}T_{1,0} = n\left[\left(1+\frac{1}{n}\right)^{n} - 1\right] < n(e-1) = O(n) . \tag{68}$$

Now it is intuitively clear that $\mathbb{E}T_{1,0} \ge \mathbb{E}T_{r,r-1}$ for all $r \ge 1$, which implies that $n\,\mathbb{E}T_{1,0} \ge \mathbb{E}T_{n,0}$. Thus, if $\beta = \log n$, then $\mathbb{E}T_{n,0} = O(n^2)$ at worst.

To obtain a lower-bound on the performance of the algorithm, we take $\beta\to\infty$. Then, for each $k$ in Eq. (65), only the $l = 0$ term will survive. Thus,

$$\lim_{\beta\to\infty}\mathbb{E}T_{n,0} = \sum_{k=0}^{n-1}\frac{n}{n-k} = n\sum_{k=1}^{n}\frac{1}{k} \approx n(\log n + \gamma) , \tag{69}$$

for large $n$, with $\gamma$ as the Euler-Mascheroni constant. So the scaling here is $O(n\log n)$. This is the best possible performance for single-spin update SA with random spin-selection on the plain Hamming weight problem. Therefore, if $\beta = \Omega(\log n)$, the scaling will be between $O(n\log n)$ and $O(n^2)$.
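The closed forms above are easy to evaluate numerically. The following sketch (illustrative only; the function names are hypothetical and the system size is an arbitrary choice) implements Eq. (64), sums it as in Eqs. (65) and (66), and compares the $\beta = \log n$ and $\beta\to\infty$ cases with Eqs. (68) and (69).

```python
import math


def step_time(n, k, beta):
    """Eq. (64): mean time to go from Hamming weight n - k to n - k - 1."""
    total = sum(math.exp(-l * beta) * math.comb(n, k - l) for l in range(k + 1))
    return n / (n - k) * total / math.comb(n, k)


def hitting_time(n, beta, start=None):
    """Mean time to reach the all-zeros string from Hamming weight `start`.

    start = n reproduces Eq. (65); start = n // 2 reproduces Eq. (66).
    """
    start = n if start is None else start
    return sum(step_time(n, k, beta) for k in range(n - start, n))


n = 40
beta = math.log(n)
last_step = step_time(n, n - 1, beta)            # E T_{1,0}, Eq. (67)
print(last_step, n * (math.e - 1))               # Eq. (68): last_step < n (e - 1)
print(hitting_time(n, beta), n * last_step)      # E T_{n,0} <= n E T_{1,0}

harmonic = sum(1.0 / k for k in range(1, n + 1))
print(hitting_time(n, 60.0), n * harmonic)       # large beta approaches Eq. (69)
```

The $\beta\to\infty$ limit is the familiar coupon-collector time $n H_n$, since every proposed flip of a 1 is accepted and every proposed flip of a 0 is rejected.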

To conclude, let us make a few remarks on the three different benchmarking metrics used in this work: (i) $\mathrm{TTS}_{\rm opt}$, (ii) the mean time to first hit the ground state ($\mathbb{E}T_{n/2,0}$ or $\mathbb{E}T_{n,0}$), and (iii) the time to cross a particular fixed threshold probability (typically high, say $p_{\rm ThC} = 0.9$) of finding the ground state ($\tau_{\rm ThC}$). In order to compare the three, we would need to find $\mathrm{TTS}(t_f = \mathbb{E}T_{n/2,0})$ and $\mathrm{TTS}(t_f = \tau_{\rm ThC})$. By definition, $\mathrm{TTS}_{\rm opt}$ will be the smallest of the three. Further note that typically $p_{\rm GS}(t_f = \mathbb{E}T_{n/2,0}) < p_{\rm ThC}$. This implies that if $t_f = \mathbb{E}T_{n/2,0}$, then the algorithm would need to be repeated more times than if $t_f = \tau_{\rm ThC}$ to obtain the same confidence that we have seen the ground state at least once. Note that either could have the smaller TTS, depending on the problem at hand.

We remark that the analysis performed above for random spin-selected SA with the complexity metric $\mathbb{E}T_{n/2,0}$ captures extremely well the numerically obtained scaling of sequential spin-selected SA with the complexity metric $\tau_{\rm ThC}$.


VII. OTHER PHWO PROBLEMS

In this section we consider several other versions of the PHWO problem. The first two examples exhibit diabatic cascades, while the last does not.

FIG. 6. (a) The optimal TTS for the spike problem [9]. Inset: the optimal TTS for small problem sizes, where we observe SVD at first scaling poorly. However, as $n$ grows, this difficulty vanishes and it quickly beats QA. (b) We observe similar diabatic transitions for this problem (shown is $n = 512$ and $t_f = 9.85$) as we observed for the plateau [Fig. 2(a)], although here the success probability appears to saturate faster than for the plateau problem.

1. The "spike" problem studied by Farhi et al. [9] has the following cost function:

$$f(x) = \begin{cases} n & \text{if } |x| = \frac{n}{4} , \\ |x| & \text{elsewhere} . \end{cases} \tag{70}$$

This too is a problem designed explicitly to stymie SA (in Ref. [9] it is argued that SA will take exponential time) and has a polynomially decreasing gap (and thus will have some polynomial run time in the adiabatic regime). In Fig. 6 we show that this problem too shows diabatic cascades and a corresponding outperformance by SVD.


2. We pick an instance of the plateau with $l = n/4$ [i.e., $\Theta(n)$] and $u = l + 6$ [i.e., $l + O(1)$]. This problem has a constant gap lower-bound for QA by Reichardt's theorem [see Eq. (4)], and SA is able to solve it in constant time [recall the discussion below Eq. (62)]. In Fig. 7 we see that this problem too exhibits diabatic transitions for QA and an advantage for SVD. For this problem, as we show in SM-VIII, the semiclassical effective potential asymptotically becomes identical to the unperturbed Hamming weight problem, which explains why the $\mathrm{TTS}_{\rm opt}$ for this is decreasing: the $\mathrm{TTS}_{\rm opt}$ for the (plain) Hamming weight problem is constant.

FIG. 7. (a) The optimal TTS for the plateau with $l(n) = n/4$ and $u(n) = l(n) + 6$. As can be seen, SVD is much better than QA for these problem sizes. (b) We observe similar diabatic cascades for this problem (shown is $n = 512$ and $t_f = 10$) as we observed for the plateau problem [Fig. 2(a)].

3. Consider the following class of PHWO problems, introduced in Ref. [11]:

$$f(x) = \begin{cases} p(|x|) , & |x| > (\frac{1}{2}+\epsilon)n , \\ |x| , & |x| \le (\frac{1}{2}+\epsilon)n , \end{cases} \tag{71}$$

where $\epsilon > 0$ and $p(\cdot)$ is a decreasing function which attains the global minimum, $-1$, in the $|x| > (\frac{1}{2}+\epsilon)n$ region. Ref. [11] proved that this class of problems has an exponentially decreasing gap, and therefore the adiabatic algorithm would take exponentially long to find the ground state. We have considered the following instance of this class:

$$f(x) = \begin{cases} -1 , & |x| = n , \\ |x| , & \text{otherwise} . \end{cases} \tag{72}$$

In this case, we did not observe the diabatic transition phenomenon (not shown), i.e., the optimal TTS is achieved by evolving adiabatically and remaining in the ground state. Thus the diabatic transition phenomenon does not persist for all PHWO problems.

VIII. ASYMPTOTIC BEHAVIOR OF
SEMICLASSICAL EFFECTIVE POTENTIALS

Here we analyze the behavior of the (symmetric) effective potential we found in Eq. (26) and write down here in simplified form:

$$V_{\rm SC}(\theta,\varphi,s) = \langle\theta,\varphi|H(s)|\theta,\varphi\rangle = \frac{n}{2}(1-s)(1-\sin\theta\cos\varphi) + s\sum_{w=0}^{n} f(w)\binom{n}{w}p(\theta)^{w}\left(1-p(\theta)\right)^{n-w} , \tag{73}$$

where $p(\theta) = \sin^2(\theta/2)$. We take $f(w)$ to be a PHWO Hamiltonian of the form of Eq. (2). We can write the plateau's effective potential as:

$$V_{\rm SC}^{\rm plateau} = V_{\rm SC}^{\rm HW} + s\sum_{l<k<u} p(k)\binom{n}{k}p(\theta)^{k}\left(1-p(\theta)\right)^{n-k} , \tag{74}$$

where $p(k)$ denotes the perturbation at Hamming weight $k$. Note the resemblance between the perturbation in the above equation and the term that appears in Reichardt's lower-bound [see Eq. (51f)]. The only difference is that we have replaced $q(s)$ with $p(\theta)$. Now, if we trace through the arguments deriving the lower-bound on the gap, we see that the same holds for the perturbation term here. In particular:

$$s\sum_{l<k<u} p(k)\binom{n}{k}p(\theta)^{k}\left(1-p(\theta)\right)^{n-k} = O\!\left(h\,\frac{u-l}{\sqrt{l}}\right) . \tag{75}$$

Therefore, when $l, u = O(1)$, the semiclassical effective potential asymptotically maintains a perturbation relative to the unperturbed problem. On the other hand, for the cases $l = \Theta(n)$, $u = l + O(1)$ and $l = \Theta(n)$, $u = l + \Theta(n^{1/4-\epsilon})$, the perturbation to the semiclassical effective potential vanishes asymptotically. This shows that the effective potential leads to equivalent conclusions about computational hardness as the gap analysis.

IX. BEHAVIOR OF THE AVERAGE HAMMING
WEIGHT ON THE CLASSICAL GIBBS STATE

In this section we expand on the behavior of the average Hamming weight $\langle\mathrm{HW}\rangle$ for different cases of the plateau problem.

In Fig. 1(a), we plotted $\langle\mathrm{HW}\rangle$ for the classical Gibbs state as a function of the inverse temperature, $\beta$. We interpreted the sharp drops in this quantity as a sign that the problem becomes hard for SA. To understand this better we can consider these sharp drops as modifications of the smooth behavior [see Fig. 8(a)] of the plain Hamming weight function, i.e., Eq. (2) with $p(|x|) = 0$. For this case:

$$\Delta(\beta) \equiv \langle\mathrm{HW}\rangle_{\rm plain} = \frac{n}{1+e^{\beta}} . \tag{76}$$

In order to study just the "drop," we subtract $\Delta(\beta)$ as the "background," and focus our attention on the "signal," which is the sharp change.

We consider the following two varieties of the plateau:

1. $l, u = O(1)$. This was the case studied in the main text, and we proved above that in this case SA requires polynomial time. As we can see from Fig. 8(b), the sharpness of the drop remains constant with increasing $n$. This is consistent with the problem being hard for SA.

2. $l = \Theta(n)$ and $u = l + O(1)$. Reichardt's lower bound applies in this case, and we proved above that this case is solved in time $O(1)$ by SA. As can be seen in Fig. 8(c), the sharpness of the drop decreases (albeit slowly) with $n$. This is consistent with the problem being easy for SA.

To conclude we remark on another case which has a constant gap lower-bound by Reichardt's proof. Here, $l = \Theta(n)$, $u = l + \Theta(n^{1/4-\epsilon})$, $\epsilon > 0$. We do not find any dramatic changes in the instantaneous quantum ground state during the evolution as in Fig. 1(a), suggesting that multi-qubit tunneling does not play a significant role, hence making it a less relevant problem for our discussion.

FIG. 8. (a) $\langle\mathrm{HW}\rangle$ in the Gibbs state of the plain Hamming weight function and the plateau function with $l = 0$ and $u = 26$ for $n = 128$. The two functions agree closely except in the region of the "drop." (b) The "signal" $\langle\mathrm{HW}\rangle - \Delta(\beta)$ for $l = 0$, $u = 26$, for $n = 128, 256, 512, 1024$. The same sharp drop is seen for all $n$. (c) $\langle\mathrm{HW}\rangle - \Delta(\beta)$ for the case $l(n) = n/4$, $u(n) = l(n) + 26$, for $n = 256, 512, 1024, 2048, 3200, 4096$. Here the drop is slowly decreasing with $n$.
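The Gibbs-state average of the Hamming weight discussed in this section can be computed directly for any cost function that depends only on the Hamming weight, by weighting each $w$ with $\binom{n}{w}e^{-\beta f(w)}$. The sketch below is illustrative only: the function names are hypothetical, and the plateau cost `plateau_cost` assumes the form $f = u-1$ for $l < |x| < u$ and $f = |x|$ otherwise, which should be checked against Eq. (2) of the main text.

```python
import math


def gibbs_average_weight(n, beta, cost):
    """Average Hamming weight in the Gibbs state of a weight-only cost function."""
    log_terms = [math.lgamma(n + 1) - math.lgamma(w + 1) - math.lgamma(n - w + 1)
                 - beta * cost(w) for w in range(n + 1)]
    shift = max(log_terms)                       # log-sum-exp stabilization
    weights = [math.exp(t - shift) for t in log_terms]
    z = sum(weights)
    return sum(w * wt for w, wt in enumerate(weights)) / z


def plain_cost(w):
    """Plain Hamming weight cost."""
    return w


def plateau_cost(w, l=0, u=26):
    """Assumed plateau cost: u - 1 on l < w < u, the Hamming weight elsewhere."""
    return u - 1 if l < w < u else w


n = 128
for beta in (0.5, 1.0, 2.0, 4.0):
    background = n / (1.0 + math.exp(beta))      # plain result, Eq. (76)
    signal = gibbs_average_weight(n, beta, plateau_cost) - background
    print(beta, gibbs_average_weight(n, beta, plain_cost), background, signal)
```

For the plain cost the bits are independent, each equal to 1 with probability $1/(1+e^{\beta})$, which is why the numerical average reproduces Eq. (76).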




























